Unsupervised Resolution of Objects and Relations on the Web

نویسندگان

  • Alexander Yates
  • Oren Etzioni
چکیده

The task of identifying synonymous relations and objects, or Synonym Resolution (SR), is critical for high-quality information extraction. The bulk of previous SR work assumed strong domain knowledge or hand-tagged training examples. This paper investigates SR in the context of unsupervised information extraction, where neither is available. The paper presents a scalable, fully-implemented system for SR that runs in O(KN log N) time in the number of extractions N and the maximum number of synonyms per word, K. The system, called RESOLVER, introduces a probabilistic relational model for predicting whether two strings are co-referential based on the similarity of the assertions containing them. Given two million assertions extracted from the Web, RESOLVER resolves objects with 78% precision and an estimated 68% recall and resolves relations with 90% precision and 35% recall.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Unsupervised Methods for Determining Object and Relation Synonyms on the Web

The task of identifying synonymous relations and objects, or synonym resolution, is critical for high-quality information extraction. This paper investigates synonym resolution in the context of unsupervised information extraction, where neither hand-tagged training examples nor domain knowledge is available. The paper presents a scalable, fullyimplemented system that runs in O(KN log N) time i...

متن کامل

ارزیابی تطبیقی کارایی ساختار فراداده نظام‌های شناسگر دیجیتالی

The main solution to the problems of persistency and uniqueness in identification of digital objects in a web environment is provided by using digital identifiers instead of URL. The main basis of this solution is resolution mechanism that is used in digital identifier systems. Resolution is the use of indirect names instead of URLs; what worked for the DNS (Domain Name System) in stabilizing i...

متن کامل

Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images

As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...

متن کامل

Kohonen Self Organizing for Automatic Identification of Cartographic Objects

Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man made objects in essence is still an unresolved issue, the man made objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...

متن کامل

مدل‌سازی روابط توپولوژیک سه بعدی فازی در محیط GIS

Nowadays, geospatial information systems (GIS) are widely used to solve different spatial problems based on various types of fundamental data: spatial, temporal, attribute and topological relations. Topological relations are the most important part of GIS which distinguish it from the other kinds of information technologies. One of the important mechanisms for representing topological relations...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007